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ABSTRACT 

Restriction  and  prolongation  operators  are  used  to  provide  a unified  frame- 
work for  the  discussion  of  errors  in  approximating  evolutionary  equations.  A 
generalized  truncation  error  enables  the  spline-Galerkin  method  to  be  studied  in 
detail  and  the  accuracy  of  various  treatments  of  non-linear  terms  (such  as  the 
advection  operator  v ^ compared:  it  is  shown  how  a multi-stage  Galerkin  pro- 
cess can  give  errors  which  are  0(h^^)  for  splines  of  order  p and  quite  general 
differential  operators.  A Petrov-Galerkin  method  is  derived  for  8 = a3  u which 
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is  accurate  and  stable. 
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SIGNIFICANCE  AND  EXPANATION 

Most  problems  in  continuum  mechanics  (fluid  flow,  combustion,  elasticity) 
involve  partial  differential  equations  that  can  be  solved  only  numerically  by 
computer.  Originally  most  numerical  methods  for  this  class  of  problem  used 
finite  differences,  which  involve  direct  replacement  of  derivatives  by  difference 
formulae.  In  the  last  ten  or  fifteen  years,  finite  element  methods  have  become 
increasingly  popular.  These  involve  starting  from  an  assumed  functional  form  for 
the  unknowns  (e.g.  piecewise  polynomial) , then  determining  parameters  in  the  func- 
tional representation  via  satisfying  the  partial  differential  equation  in  some 
approximate  sense. 

As  finite  element  methods  become  increasingly  used  for  non-steady  problems, 
it  becomes  important  that  their  performance  can  be  compared  in  detail  with  that 
of  the  longer  established  finite  difference  methods.  This  paper  sets  up  a frame- 
work in  which  this  can  be  done.  Analysis  of  the  spline-Galerkin  methods  is 
carried  out  and  highly  accurate  schemes  put  forward  for  the  advection  operator 
v*^v  , which  occurs  in  the  equations  for  practical  problems  in  which  motion  of 
fluid  and  material  particles  are  involved. 


The  responsibility  for  the  wording  and  views  expressed  in  this  descriptive  summary 
lies  with  MRC,  and  not  with  the  authors  of  this  report. 
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M.  J.  P.  Cullen  and  K.  W.  Morton 


1.  INTRODUCTION 

Any  step-by-step  procedure  for  approximating  an  initial-value  problem  u^  = Lu  or 
u^^  = Lu  consists  of  three  stages:  discretisation  of  the  initial  data;  updating  the  dis- 
crete approximant  to  match  the  evolution  of  the  true  solution;  and  interpretation  of  the 
final  result.  In  this  paper,  we  present  a general  framework  for  describing  and  analysing 
such  procedures  based  on: the  restriction  and  prolongation  operators  introduced  by  Aubin  (1) 
for  studying  elliptic  problems  and  used  by  Noble  [20]  and  others  for  Fredholm  integral  equa- 
tions. The  first  stage  of  a procedure  naturally  entails  the  use  of  a restriction  operator 

r , the  second  an  evolution  operator  E,  and  the  last  a prolongation  p.  . Our  objective  is 
n h h 

a practical  one:  to  provide  a framework  within  which  in  particular  finite  element,  finite 
difference  and  spectral  methods  can  be  compared  in  detail  and  the  development  of  hybrid 
methods  encouraged.  Detailed  comparisons  mainly  refer  to  the  advection  operator 

Following  Swartz  and  Wendroff  [28]  , Douglas  and  Dupont  [8]  and  Dupont  [10] , the 
standard  error  analysis  of  Galerkin  methods  when  Lu  is  linear,  elliptic  and  negative  de- 
finite introduces  an  elliptic  projection  of  the  solution  Pu  and  estimates  the  rate  of 
evolution  of  the  difference  Pu  - 0,  where  U is  the  Galerkin  approximation.  Much  of  the 

theory  for  elliptic  problems  can  then  be  invoked  to  show  that  this  "evolutionary  error"  has 

k 2 

the  optimal  rate  of  convergence  0(h)  in  the  L norm,  where  h is  a space  discretisation 
parameter  and  the  approximation  space  or  trial  space  has  order  of  accuracy  h . The 
analysis  is  based  on  energy  estimates  and  is  paurticularly  appropriate  when  L is  self-adjoint. 
For  then  the  Ritz  projection  Pu  is  in  a definite  sense  the  best  approximation  to  u and 
gives  a firm  basis  of  comparison  for  U.  But  the  approach  can  also  be  used  for  more  general 
elliptic  operators  (see  Douglas,  Dupont  and  Wheeler  [9])  and  for  some  non-linear  problems 
(see,  for  instance,  Wheeler  [34]  and  Dendy  [7]). 


Supported  by  the  Meteorological  Office,  in  partial  fulfillment  of  the  reauirements 
for  a Ph.D.  at  University  of  Reading. 
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However,  as  the  departure  from  self-adjointness  is  increased,  Pu 


becomes  a less 


and  less  good  approximation  to  u or,  if  a weighted  projection  is  used,  the  penalty  is 
paid  in  an  extra  growth  term  for  Pu  - U.  Coercivity  of  L is  enough  to  obtain  optimal 
order  of  accuracy  but  the  constants  may  be  large.  Indeed,  at  the  extreme  of  first  order 
hyperbolic  equations  even  this  may  be  lost,  as  was  shown  by  the  example  of  Dupont  [111  - even 
though  Dendy  [6]  demonstrated  how  this  could  be  recovered.  Nevertheless,  the  general  approach 
is  still  valid  in  all  cases:  U should  be  compared  with  a projection  of  u into  the  approxi- 
mation space  and  much  information  is  lost  if  it  is  con^iared  directly  with  u.  In  integrating 
over  any  reasonable  length  of  time,  the  evolutionary  error  is  dominant  and  the  most  useful 
schemes  will  often  exhibit  phenomena  of  superconvergence  which  can  best  be  studied  through 
such  comparisons.  Unfortunately  the  standard  analysis  precludes  this  by  the  presence  of  a 
term  P3^u  - 3^U.  We  present  an  alternative  analysis  which  places  more  emphasis  on  the  dis- 
crete procedure:  it  is  quite  general  both  as  regard  to  type  of  procedure  and  the  projection 
used  in  the  comparison  and,  by  means  of  a generalised  truncation  error,  allows  superconver- 
gence of  the  evolutionary  error  to  be  easily  studied. 

A particularly  interesting  case  of  superconvergence  was  exhibited  by  Thomee  (32)  and 
Thomee  and  Wendroff  [33)  . They  showed  that  linear  elements  used  with  a Galerkin  procedure 
on  first  order  linear  hyperbolic  equations  could  be  interpreted  as  giving  a finite  difference 
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scheme  with  0(h  ) accuracy,  and  more  generally  splines  of  order  g gave  an  accuracy  of 

0(h^^).  We  shall  show  that  this  is  true  under  any  projection  and  that  with  orthogonal  pro- 
2 

tion  in  L the  super convergence  continues  to  hold  with  non-linear  terms  like  v ^u.  This 
view-point  draws  attention  to  the  possibility  of  evaluating  such  a product  in  a number  of 
different  ways  - pointwise  multiplication,  simple  Galerkin  and  two-stage  Galerkin  - which 
have  very  different  error  characteristics.  In  numerical  experiments  on  the  shallow  water 
equations,  Cullen  [3,4)  found  that  the  non-conservative  two-stage  Galerkin  process  gave  very 
much  better  results  than  the  single  stage  process:  the  analysis  indicates  why  this  is  so. 
Similarly,  higher  order  differential  operators  led  to  a loss  of  accuracy  in  Thomee 's 
analysis  - for  instance,  only  0(h  ) for  the  heat  equation  and  linear  elements:  we  shall 
show  that  a multistage  process  can  retain  the  full  0{h^^)  accuracy  for  any  order,  at  the 
cost  of  a less  compact  scheme. 
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The  plan  of  the  rest  of  the  paper  is  as  follows.  In  section  2,  restriction  and 
prolongation  operators  are  introduced  and  the  unified  framework  for  analysing  evolutionary 
error  is  developed.  This  is  illustrated  in  section  3 on  a simple  finite  element  problem  and 
applied  in  section  4 to  the  semi-discrete  spline  Galerkin  method.  In  both  sections  Fourier 
analysis  is  the  main  tool  for  estimating  the  errors.  Section  5 places  finite  difference 
methods  in  the  same  frcunework,  drawing  attention  to  the  optimal  recovery  problem  arising  at 
the  final  prolongation.  A Petrov-Galerkin  method  for  u^  = au^  is  proposed  in  section  6 
which  is  motivated  by  reference  to  characteristics,  shown  to  be  stable  by  an  energy  analysis 
and  its  accuracy  determined  by  Fourier  analysis.  Finally,  in  section  7,  the  results  as  they 
pertain  to  the  advection  operator  are  drawn  together,  the  importance  of  the  restriction  oper- 
ator in  damping  short-wave  length  modes  emphasised  and  the  accuracy  of  an  irregular  mesh  con- 
sidered. 


2.  unified  analysis  of  evolutionary  error 


We  consider  pure  initial- value  problems  in  the  form 

3^u  - Lu  * f on  (0,T]  X 3?“^ 

u(0,x)  »=  u°(x)  , 


(2.1) 


where  u is  vector-valued  (and  may  be  complex  in  Fourier  analysis)  and  L is  a differen- 
tial operator  in  , which  may  be  non-linear  but  has  real  coefficients  and,  li)ce  f,  does 
not  depend  explicitly  on  t.  It  is  most  convenient  to  wor)c  in  a separable  Hilbert  space  V 
with  norm  (1*11  and  inner  product  <• , •>  and  denote  by  u(t)  the  mapping  u:  (0,Tl  -►  V. 

We  assume  that  L generates  a strong  evolution  operator  E(t)  :V  -►  V , i ^ 0,  so  that  we 
may  write 

u{t+T)  = E(T)u(t)  , T i 0 . (2.2) 

(See,  e.g.,  Kato  [16]  for  conditions  under  which  this  may  be  established  and  Tartar  [30]  for 
a brief  introduction.) 

Following  Aubin  [1],  we  associate  the  triplet  (V.  ,p.  ,r.  ) with  any  procedure  for 

h h h 

approximating  members  of  the  solution  space  V on  a discrete  mesh  in  which  is  charac- 
terised by  a parameter  h > 0.  The  discrete  space  with  norm  ll"||jj>  consists  of  ele- 

ments Uj^  which  atre  finite  dimensional  vectors  of  parameters  defining  an  approximation  to 
u;  the  restriction  operator  '^h  ® continuous  mapping  identifying  this  relation- 
ship; and  the  prolongation  operator  p^^  is  an  isomorphism  from  to  a closed  subspace 

Sj^  of  V,  called  the  approximation  space.  In  a typical  situation,  night  consist  of 

2 

continuous  piecewise  linear  functions  over  a triangulation  of  K , u^^  the  set  of  nodal 

values  at  the  vertices,  r^^u  those  values  obtained  from  a least  squares  fit  to  u and 

P.  r u the  resultant  linear  approximation.  For  a given  prolongation  p.  , we  can  define 
n n n 

the  optimal  restriction  (in  V)  , denoted  bv  r.  » such  that 

h 

l|u  - ull  = inf  l|u  - PwV.  II  , Vu  e V . (2.3) 
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It  follows  that 


■^hPh 


= I. 


= 1 


(2.4) 


where  I.  is  the  identity  in  V,  . In  fact  we  shall  always  ensure  that  r,p,  = I,  / so  that 
h h h h h 

p.  r.  is  a projection,  and  p.  r,  is  the  orthogonal  projector:  also  the  usual  discrete  norm 
h h h h 

is  given  by  shall  always  assume  that  is  uniformly  bound- 
ed as  h -*•  0.  Similarly  for  a given  r^^,  is  called  an  optimal  prolongation  if 

■^h^h'^h  IIVhil=  Ill'll  ' '^^'h  ^ '"h  • 

{veV|rj^v=Vj^) 


Part  of  the  convenience  of  wor)cing  in  a Hilbert  space  is  that  a unique  p exists  for  each 

h 

rj^  and  satisfies  a dual  property 

II"  ■ Vhll  ' ^ ^ ■ <2.6) 

A semi-discrete  approximation  to  (2.1)  consists  of  a one  parameter  family  u^(t),  that 
is  a mapping  [0,T]  .»•  satisfying  a system  of  ordinary  differential  equations 

^t^  - = ^h  ' ^ ^ 

where  '*  some  sense  approximates  L and  f^  approximates  f.  We  shall  normally 

assume  that  u (0)  = r u^. 


We  shall  consider  only  those  fully-discrete  approximations  whic)i  are  defined  on  time- 

levels  ” ^0  ^'1  <'n  <•••<  *-f]  ~ shall  denote  jay  a superscript  n both  the 

value  of  u(t)  at  t and  its  approximation  u",  satisfying  a one-step  procedure 
n h 


n+1 


,(n)  n 
^h  "h 


(2.8) 


Here  :V  -►  V approximates  the  evolution  operator  £(t  ,-t  ) and  again  we  shall 

h h h n+1  ii 

normally  tajce  u^  = r.u*^.  Multi-step  schemes  may  (ae  included  in  the  formulation  as  follows: 
h h 

multiple  sets  of  parameters,  which  may  joe  identifiable  with  intermediate  time  levels,  are 
included  in  the  specification  of  V ; prolongations  p,  can  still  refer  to  a single  main 


time-level  but  the  definitions  of  restrictions  r.  must  be  extended  to  feunilies 

h 

{u(t),  0 _<  t £ T}  and  the  optimality  definitions  in  (2.3)  and  (2.6)  referred  to  u(t)  e V 
for  each  t . 

(a)  Evolutionary  error  in  the  semi-discrete  case 

We  suppose  p,  and  r,  to  be  time- independent.  How  they  are  chosen  will  depend  on 
n n 

the  discrete  method  and  the  analysis:  a Galerkin  method  usually  implies  a prolongation  and 

a difference  method  may  imply  a restriction;  in  either,  a prolongation  is  iitplied  if 

is  compared  to  u,  and  a restriction  if  u.  compared  to  r,u.  For  simplicity  of  notation, 

h h 

we  shall  often  denote  Pj^Uj^(t)  by  U(t)  , and  the  parameters  in  “j^(t)  by  (t)  . Then 

the  projection  p,  r,  enables  us  to  split  the  error  as 
h h 

u - U = (I  - P,r^)u  + p,  (r.u  - u ) . (2.9) 

h h h h h 

The  first  term  on  the  right  is  purely  an  approximation  error  in  the  space  and  can  be 

estimated  from  approximation  theory.  The  second  term,  called  the  evolutionary  error,  is 
of  greatest  interest:  we  shall  use  the  notation 


® = Ph®h 


We  consider  first  the  usual  estimation  procedure  when  u,  is  generated  by  a Galerkin 

h 

process  and  L is  linear  and  coercive.  Then,  interpreting  such  expressions  as 
a distributional  sense,  we  decompose  the  difference  between  (2.1)  and  the  prolongation  of 


(2.7)  as 


V ° 'Ph’^h'"’ V ^ ^ '^Ph'PhH.'"h  ^ '^'Ph^h’  • 

The  first  and  last  terms  on  the  right  are  again  approximation  errors  and  the  decomposition 

is  aimed  at  isolating  the  middle  terms.  For  a Galerkin  process,  defined  with 

L.  « r.  Lp.  , f.  “ r.f  so  that,  for  real  u,  f and  L,  we  have 
n h h h h 


'‘PhVh PhV  - '^Ph"h  ^ ° • 


If,  in  addition,  the  restriction  is  to  be  defined  such  that  Pj^r^^  is  the  elliptic  pro- 
jection derived  from  L,  we  make  the  further  splitting 


-6- 


1 


L(u  - p.  u ) = L(u  - p,  r.  u)  + Le 
n h h h 


< L(u  - p.  r.  u)  , e > * 0 . 

n n 


Hence  from  (2.11)  we  obtain  the  energy  estimate 


= (3^e,e>  =(Le,e>  + <(Pj^rj^-I)  3^u,e> 


— ||e||  +a||e||  1 ||  3^,u  H (2.16) 

if  < Le  ,e  > £ - a ||e  II  ^ . 

This  decomposition  is  heavily  oriented  towards  the  Galerkin  process  and  elliptic  pro- 
jection and  the  second  term  in  (2.11)  will  often  be  difficult  to  estimate.  It  also  has  the 
disadvantage  that  the  first  term  is  present  throughout  and  precludes  the  immediate  study  of 
superconvergence  phenomena.  We  adopt  instead  a decomposition  which  places  more  emphasis  on 
the  discrete  operator  Thus  we  obtain  from  (2.7)  and  the  restriction  of  (2.1) 

■ <SV-Vh>  = ^ 

reducing,  when  L and  L.  are  linear,  to 

h 

3 e - L e = (r  L-L.r  )u  + (r.  f-f,  ) . (2.18) 

th  hh  hhh  hh 


The  terms  on  the  right  in  (2.17)  and  (2.18)  we  call  the  truncation  error  (T.E.).  It  is  the 
)cey  term  in  our  analysis  and  may  be  estimated  in  a variety  of  ways.  In  particular,  in  the 


case  treated  above  we  have  from  the  Galerkin  process 

<PhVh"-''Vh"'®>  = ° ' <PhV"'^>  = ° ' <2.19) 

which  gives  the  intermediate  result 

<Ph<Vh-Vh>'^>  = <'PhV-^Ph’^h>'^  ^ <PhV^>^'^>  ■ <2-20) 

Then,  of  course,  when  we  introduce  r^^  so  that  is  the  elliptic  projector,  we  recover 


(2.15).  But  in  general  we  shall  prefer  to  retain  (2.18)  and  concentrate  attention  on  the 
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difference  of  operators  t.  is  more  general  and  more  precise  and  can  give  point- 

h n n 

wise  bounds. 

(b)  Evolutionary  error  in  the  discrete  case 

For  notational  simplicity  we  consider  the  case  where  = At,  a constant,  and 

denote  by  the  iterated  operator  E^  over  s time  steps.  Then  with  the  usual  regroup- 

ing of  terms  we  obtain  from  (2.2)  and  (2.8) 

0 n 0 ,„n  0 „n  0, 

e,  = r,  (E(At))  u - E,  r u + (E,  r u - E.u  ) 
hh  hh  hh  hh 

= [rj^E(At)u""^  - Ej^rj^u"”^)  + ... 

n-s-1  „s+l  n-s-1, 

...  + [E,r,E(At)u  - E,  r,u  ] + ... 

h h h h 

...  + [E^'^rj^E(At)u°  - E||r^u°]  + (E|Jr^u°  - e[|u|J)  . (2.21) 

We  assume  that  the  approximate  procedure  is  stable:  that  is,  that  there  exist  constants  Yq 
and  K such  that 


Y-m  At 

^ llv"hilh  ' ^'''h  " \ 


(2.22) 


Then  we  have 


n-1  Y-SAt 


s=0 


h h 


Y t 

Ke  ° " { ||rj^u°-Uj^||j^  + t^  sup  ||  (At)  ^ frj^E(At) -Ej^rj^]u(t)  ||^}  , (2.23) 

^ ^'n 

in  a form  similar  to  that  in  standard  finite  difference  analysis  (see  Richtmyer  and  Morton 
(231)  . 

In  most  practical  problems  in  several  space  dimensions,  the  error  in  the  time  dis- 
cretisation is  of  much  less  concern  than  that  in  the  spatial  variables.  It  is  then  appro- 
priate to  estimate  errors  from  (2.18)  rather  than  from  (2.23),  which  is  also  simpler.  Note 
that  if  the  time  discretisation  is  unimportant  and  Euler's  method  is  assumed,  then 
Ej^Uj^  ” “h  * ’ formally  too,  E(At)u  -»  u + At[Lu+f]  which  then  relates  (2.23)  to 
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(2.18).  For  conditions  under  which  this  secon.i  limit  is  valid  the  reader  should  consult 
the  references  on  evolutionary  equations  already  cited.  Note,  too,  that  the  stability  con- 


% 


dition  on  the  discrete  scheme  is  replaced  by  the  presence  of  the  term  - L,  e on  the  left 

h n 

of  (2.18). 

Use  of  Fourier  analysis 

Fourier  analysis  gives  the  most  precise  insight  into  the  local  behavior  of  approxima- 
tions and  is  therefore  generally  used  in  comparative  assessment  of  methods  on  simple  model 
problems  - see,  e.g.,  Swartz  and  Wendroff  [291.  It  has  long  been  the  most  important  tool  in 
the  study  of  the  stability  of  difference  schemes  and,  although  subsequently  rejected  by 
Strang  and  Fix  [27]  in  their  analysis  of  the  finite  element  method,  their  Fourier  analysis 
of  the  method  on  a uniform  mesh,  given  in  an  abreviated  form  by  Strang  [26] , is  very  illu- 
minating and  forms  the  basis  of  the  analysis  in  section  4.  The  difficulties  arise  when  one 
wishes  to  be  both  precise  and  rigorous  outside  the  simple  case  of  linear,  constant  coeffi- 
cient problems  on  a uniform  mesh.  However,  both  in  differential  equation  theory  and  the 

stability  theory  of  difference  schemes,  many  of  the  results  from  Fourier  analysis  have  been 

transferred  to  more  general  situations  and  much  can  be  done  here  too. 

In  the  simple  case,  and  assuming  too  that  the  restriction  operator  is  the  same  for  all 

components  of  u,  r^^  the  transform  of  r^^  is  a scalar  multiplier  and  emerges  from  the 
homogeneous  error  term  in  (2.18)  as  a factor.  Thus  let  It  denote  the  d-component  transform 
variable  and  )fx  its  inner  product  with  the  space  variable  x:  then  with  (5,  L,  etc. 
denoting  transforms,  we  need  to  compute  e()<h).  where 

L^/i  = 1 - £(l<h)  . (2.24) 

Equation  (2.18),  with  f ta)cen  as  r,  f,  transforms  to 

h h 

3^e  - L,e,  = r,  [e  ()th)  L(lt)  u ()t)  ] , (2.25) 

t h h h h 

and  the  possibility  of  superconvergence  depends  on  E()th).  We  use  the  term  superconvergence 
as  in  Dupont  [12]  to  refer  to  any  property  of  u that  is  matched  by  the  approximant  to 


I 
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higher  order  than  that  characterising  For  evolutionary  error  it  means  the  matching  of 

r^u  by  u^  and  will  therefore  normally  depend  on  the  choice  of  r^^:  but  in  this  simple 
case  it  does  not. 

As  pointed  out  by  Kreiss  and  Oliger  117] , any  discrete  procedure  needs  at  least  two 
grid  points  per  wave  length  to  give  minimal  accuracy  and  in  practice  e (kh)  will  only  be 
reasonably  small  for  |k|  < Tr/2h.  It  is  therefore  useful  to  split  the  truncation  error  by 
looking  for  bounds  of  the  form  |E(kh)l  £Cj^|kh|'*  for  |kh|  ^ii/2,  |e(kh)|  f.  for  all 

kh,  so  as  to  obtain  for  u = (2t!)  ‘^Je^'^"’'u(k)dk 

(2it)‘^  |t.E.|  C h^  / |r  (k)  I . |k'’£(k)u(k)  |dk  + C / | r (k) L(k) u<k)  |dk : (2.26) 

|k|<^Ti/2h  (k|^Tr/2h 

E (kh)  is  responsible  for  keeping  the  first  term  small  and  the  second  will  depend  on  the 
spectrum  u(k)  of  u and  possible  damping  of  inaccurate  modes  by  r^(k) . Then  an  error 
bound  for  obtained  immediately  from  (2.18)  and  (2.26)  with  an  assumption  such 


as  the  semi-boundedness 


of  I^:Re<e^.L^e^>j^  > -a|le^ 


If  now  L is  linear  but  the  coefficients  variable  or  the  mesh  non-uniform,  u can 
still  be  resolved  intc  its  Fourier  components  but  r^  cannot  be  taken  through  and 

j:(kh)  will  depend  strongly  on  its  choice,  now  being  defined  for  each  node  by 

(L,  r e^'^’’'). /(r  Le^’''"').  = 1 - £ . (kh)  . (2.27) 

n h j n j j 

Error  estimates  can  still  be  based  on  the  equation 

O^e-Le^).  = (2u)'‘^/u(k)£  . (kh)  (r.  Le^’'"^).  dk  . (2.28) 

thhhl  1 h I 

We  shall  see  in  an  example  in  section  7 how  still  be  defined  so  as  to  re- 

tain superconvergence  properties  present  in  the  uniform  mesh  case. 

When  L and  are  non-linear,  however,  interaction  between  modes  must  be  calculated 

and  each  case  will  have  to  be  treated  individually.  Such  studies  have  been  carried  out  for 
various  difference  methods  by  a number  of  authors  - see,  e.g.,  Grammeltvedt  [14].  We  shall 
consider  in  sections  3 and  4,  and  again  in  section  7,  the  typical  case  when  L(u,v)  is  a 
product  uv  as  in  the  advection  operator  v-^^u  . 
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3.  APPLICATION  TO  A SIMPLE  FINITE  ELEflEMT  PROBLEM 


We  take  the  space  V to  be  L (-“,<»)  and  consider  the  linear  spline  Galerkin  method 

on  a uniform  mesh  applied  to  u = Lu.  An  element  of  V.  consists  of  a set  of  nodal  values 

t h 

{U.}  at  knots  x = jh  and  its  prolongation  p^  is 


U = 5^  . (X) 

(j)  ^ ^ 


(3.1) 


where  summation  is  over  j e Z,  (x)  = iji(x/h-j)  and  (|i(s)  is  the  piecewise  linear  shape 

function  with  ((>(0)  = 1 and  = 0 for  non-zero  f c Z.  We  will  use  the  optimal  re- 

* 1 11 

striction  operator  r , which  makes  p r the  orthogonal  projector  into  the  space  spanned 

by  that  is,  if  M is  the  mass  matrix  defined  by  M . = < * ,*.>  and  u*'^'  the  vector 

3 m3  ^m  ^3 

defined  by  u*'^'  = < <)>  ,u),  then 


1 M-1  (<(') 

r u = M u ^ 


(3.2) 


In  this  case,  M has  components  2h/3  for  m = j,  h/6  for  |m- j I = 1 and  zero  otherwise. 


We  will  denote  by  '^be  nodal  values  of  r u. 


Now  caking  the  mode  u = u(k)e 


ikjc 


we  have 


<1)1^, u)  = u(k)J  e^'^*(ti  (xAi-m) dx  = hu(k)e'^"‘^ij)(-5) 


im?' 


(3.3) 


where  g = kh  and  ^ is  the  Fourier  transform  of 
projection  equations  is  obtained  as 


Hence  the  Fourier  transform  of  the 


k '2m-l"^  Wl>  = 


i.e.,  s M = ue^™^  3i(,(-5)/(2  + cos  5)  = ue^”'^a(5)  > say 

" 1-2  21 

A simple  computation  shows  that  i()(C)  = (jC)  sin  ^ and  hence 


(3.4) 


a(f,)  = 1 + ^ as  5-0 

5 (2  + cos  5) 


(3.5) 


We  will  distinguish  particular  prolongations  by  superscripts  (thus  p^  for  linear  splines) 
and  always  denote  the  corresponding  optimal  restriction  with  the  same  superscript. 
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This  confirms  the  second  order  approximation  error  of  the  linear  elements  for  a fixed  k, 
as  0(0  is  the  ratio  of  the  nodal  values  of  p^r^u  to  those  of  u. 

Next  let  us  consider  the  evolutionary  error  when  L = We  have  for  the  nodal  values 

Ilex 

when  u “ ue  , 


(r^Lu)j  “ ik(r^u)j  « ikue^^^a((;) 


(3.6) 


The 


discrete  Galerkin  operator  equals  Lp  , or  M K in  terms  of  the  stiffness 


matrix  K - ( 4 .Ld.):  under  Fourier  transformation  it  becomes  M K where  K = i sin  5, 
m]  m ’^3 

so  we  have 


'V 


(r^r^u).  - M'^K(r^),  - Ge^^^a(£) 


(3.7) 


The  ratio  of  the  nodal  values  (3.6)  and  (3.7)  giv^s  then 


(r^Lu).  ■ <2  - O 


3 1 sin  £ . , £ 

— ^ — r ^ 1 - hrr  as  £ 

i . r»  j^90  ■> 


(3.8) 


so  defining  e(kh)  in  (2.24).  Whatever  discrete  norm  Ij  • Hj^  is  chosen,  this  shows  that 

4 

the  evolutionary  error  is  0(h)  for  a given  k mode.  Moreover,  any  restriction  operator 
could  have  been  used  in  the  error  definition  r^u-u^:  a(^)  in  (3.4)  is  just  the  Fourier 

transform  r^  of  r^  and  would  be  changed  to  r^,  but  the  same  factor  would  appear  in  (3.6) 
and  (3.7)  and  cancel  in  (3.8). 

To  obtain  precise  error  bounds  for  a general  u will  however  depend  on  the  choice  of 
1-2 

r. . For  r , the  E in  a(l)  provides  some  damping  for  the  higher  modes.  Putting  this 
h 

with  (3.8)  , which  is  quite  accurate  up  to  ^ = 11/2,  we  can  show  for  this  case  (L  = 3^  and 


2it 


T.E.|  < Cj^h^  j |k^u(k)|dk  Iku(k)  |dk 


(3.9) 


kel 


n=l  ke I 


where  I = {kimi  < 2|k|h  < (n+Dn)  and  C * 0-0090,  C *=  4-9 

n — — 1 A 
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f I 
^ ' 

■! 


The  choice  r is  even  more  important  for  a non-linear  operator,  so  let  us  now  con- 
sider L(u,v)  = - u3  V,  with  L.  again  the  Galerkin  operator.  The  discrete  eguations 
X h 

W = L^(U,V)  5 m”^K(U,V)  become 


h(W.  +4W.+W.  ,)  + (2U.+U.  ,)(V.-V,  ,)  + (2U,+U.  ,)(V.  -V.)  = 0 . 

1-1  1 3+1  3 3-1  3 3-1  3 3+1  3+1  3 

The  interaction  of  Fourier  modes  is  exhibited  by  putting  v = ve^*'  * to  give 

- ilsin  5’+4  sin  j ^ cos  | 5 cos  j (e+£')] 

[Lh(A,r^,].  = M2  t-c^FTFcMl ^ 

On  the  other  hand 

Ir^L(u,v)K  = - i k'  uve^^*^*^  'o(£+C') 

and  denoting  the  ratio  (3.11)  to  (3.12)  by  y we  have 

G 

Y(,(e,5')  1 - -4£'‘*  )/720  as  Z-C  +0 

= 1 - 175'*/720  , when  C = C’  • 


(3.10) 


(3.11) 


(3.12) 


(3.13) 


Thus  the  fourth  order  accuracy  is  retained  through  the  non-linear  interaction  under  the  opti- 
mal projection. 

Clearly  Yq  contains  a factor  a(£)  a(£;' )/a  (5+E; ' ) which  will  differ  for  different 

choices  of  r^^  in  the  definition  of  evolutionary  error.  In  the  present  case,  this  factor 

behaves  like  1 - ^ CC'  ^"<1  cancels  a similar  factor  in  (3.11)  to  fourth  order.  If,  however, 

the  restriction  operator  r'^  corresponding  to  point  collocation  were  used,  a(f,)  r 1 so 

that  no  cancellation  occurs,  Yq  '''  1 ^ this  definition  the  Galerkin  procedure 

for  the  product  is  only  second  order  accurate.  This  is  the  interpretation  used  by  Swartz  and 

Wendroff  [29] , who  therefore  advocate  simple  multiplication  of  grid-point  values  for  evaluat- 

c 

ing  a product:  under  r that  operation  is  then  exact  of  course.  We  shall  return  to  this 
point  again  in  section  7. 
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4.  EVOLUTIONARY  ERROR  IN  THE  SPLINE-GALERKIN  METHOD 


In  [32)  and  [33)  Thoinee  and  Wendroff  analysed  the  precise  behaviour  of  the  semi-discrete 
Galerkin  method  based  on  B-splines  of  order  y,  when  applied  to  linear  differential  opera- 
tors on  the  real  line.  In  (33)  they  showed  th;t,  for  the  periodic  problem  » Lu  on 

(0,1),  an  accuracy  of  0(h'^)  was  obtained  at  equally- spaced  mesh  points  when  L was  of 
00 

order  m with  C coefficients  and  initial  data,  where  v • 2y-m  for  m even  and 
V = 2u-m+l  for  m odd  and  y >_  (m+2)/2.  Their  analysis  uses  a quasi-interpolant  so  that 
effectively  they  take  the  restriction  to  be  collocation  at  the  mesh  points  while  prolongation 
produces  an  expansion  using  these  coefficients  and  basis  functions  which  are  linear  combina- 
tions of  shifted  B-splines. 

2 

Working  in  L , we  shall  use  here  the  optimal  restriction  operator  and  show  that  the 
product  operation  then  has  accuracy  OCh^*^).  Since  it  is  clear  from  the  above  that  3^  is 
approximated  to  this  accuracy  for  any  restriction,  we  shall  then  be  able  to  show  that  quite 
general  non-linear  differential  equations  can  be  approximated  to  0(h^'')  by  using  a success- 
ion of  spline-Galerkin  projections.  The  analysis  closely  follows  (32)  with  only  minor  changes 
in  notation. 

Let  X be  the  characteristic  function  of  the  interval  [- j , j ) . Then  the  B-splines 
of  order  V on  a uniform  mesh  with  spacing  h can  be  defined  as  ♦^(x)  - ♦(x/h-j),  where 
♦ = X*X***"*X  (U  factors)  and  * represents  convolution.  The  Fourier  transform  of  ♦ is 


i(C)  = [x(C)]^  = ((j  O’^sin  I 5)*' 


(4.1) 


and  Thomee  (32)  introduces  the  trigonometric  polynomials 


...  , ,,0-2v  0-1  r -im?r  ^o-v  u 

g„  _(S)  = (-1)  h i e J 3 ♦ (x)dx  , 

'^y,o  —X  0 xm 


(4.2) 


where  v = [y  a)  and  [y  (o+l))  < y,  and  summation  is  over  m € Z:  he  proves  that 


° ^ (5+2itm)°[i((;+2Trm))  . 

(m) 


(4.3) 


In  particular,  note  that  hg^  0^^^  Fourier  transform  M of  the  mass  matrix  formed 

from  the  It  is  also  easily  seen  that  there  is  a constant  Kj^  such  that 
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r 


! 


[x(C) 


< 1 


+ K^(C/'') 


2u 


for 


(4.4) 


From  these  definitions,  the  nodal  parameters  obtained  from  projecting  a Fourier 

component  lie  with  5 = )ch  are  given  by 

/ ^’1  0.1*.:  (x)  - ue''’’^e^''’'}  4>_(x)dx  , Vm  e Z , (4.5) 

(i)  ^ 2-'" 

from  which  we  see  that 

Qj  = ue^25a(£;)  , where  a(C)  = iU)/q^  ^(C)  . (4.6) 

Hence 

[1  + Kj  (€/’')  < [X(4)j‘'a{C)  < 1 , for  |C|<1T  . (4.7) 

Also,  if  L = 3^,  then  writing  simply  p for  the  prolongation  in  this  space  and  r for  the 
optimal  restriction,  = rLp  with  Fourier  transform  K/M,  so  that 

(L  ^u)  = (ic/M)  (ru)  . = M"^(ru),  /<(.  (x)  $ ' (x)  e^^^dx 

hi  7 3 m q 


= ih  ^ [g^  l(C)/g^  j ■ (4-8) 

2u 

It  is  easily  deduced  from  (4.3)  that  g^  q = CI1  + 0(E  )]  as  5-*0  so  that  we  have 

2u 

0(h  ) accuracy  for  the  Galer)cin  approximation  to  3^  . 

Now  let  us  consider  the  product  operation. 

Lemma  4.1.  If  L(u,v)  s uv  and  (Lj^r)  (u,v)  is  defined  as  r [ (pru)  (prv)  ] , then  for 
2\i 

U,  V c H 

rL(u,v)  - (Lj^r)  (u,v)  s=  0(h^’^)  , as  h ->■  0 . (4.9) 

Proof 

i kx  ^ i k * X 

Ta)<ing  first  single  components  ue  ,ve  with  C = )ch,n  = ){'h,  the  nodal  para- 
meters for  Lj^r  are  given  by 
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(4.10) 


/y  (x)dx  = jy  ue^'"^a(5)  iji  (x)  ^ ve^'’'^u(  n)  (1  (x)  o (x)dx  . 

a .V  ni  .»  n q 

<3)^  ^ ^ Im)  (n) 

If  we  introduce  the  factor  y(^,n)  by  putting 

= uve^^ ' a(  ?)  a(n)  Y<C' n)  < (4.11) 

we  obtain 

hg  n(C+n)Y(e,n)  = /I  (x)y  e^""*  (x)*  (x)dx  . (4.12) 

(m)  (n) 

The  triple  product  can  be  evaluated  by  Poisson's  suxination  formula 

I (X)  = I ♦(et2,j)e"<®^2,j)x/h  ,4  ^3, 

(m)  (j) 

to  give 

g f,(5+n) Y(?*h)  ” JI  I $(£+2inn)i{n+2'itn)e  ^ 2'n(m  "'1®* 

(m)  (n) 

“I  I ♦(  £+2'>im)  i (ri+2nn)  M t+n+2ii  [m+n] ) . (4.14) 

(ra)  (n) 

Clearly  y li)te  g^  ^ is  2Ti-periodic  in  its  arguments  and  from  (4.1)  we  see  that 

1 < tx(5))<(n)x(C+n)r'g  .(5+n)Y(Crn)  11  + K,((?A)^''+  (4.15) 

for  Is],  |n|  ^ n and  some  constant  Kj.  Now  (rL) ^ contains  the  factor  a(C+n) , so  if  we 
denote  by  1 - e(?,n)  the  ratio  (L^r)^/(rL)j  we  have  from  (4.6),  (4.11)  and  (4.14) 

e(C,ri)  = 1 - a(£)a(n)  lx(6+n)  1 ’^g  n(€+n)Y(5,n)  . (4.16) 

p , u 

From  the  bounds  (4.7)  and  (4.15)  it  follows  that 

|e(e,n)  I 1 K3((C/ii)^''+  (n/ir)^‘'),  for  |c|,|n|<it  . (4.17) 

Integrating  now  over  all  components  of  u and  v,  and  noting  that  both  |1l(u,v)|( 
and  l|L(pru,prv)  ||  are  tx>unded  by  |lul|  Hv]!  , we  have 
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rL(u,v)  - L^r(u,v)  = rluv  - (pru)(prv)] 


= /dk"  r I G(k)e^''’'v(k"-k)e 


i (k"-k) 


f (kh,k"h-kh)dk  . (4.1fi) 


The  integral  over  Ik  I > -n/h,  |k'*-k|  > n/h  converges  to  zero  as  h ♦ 0 and  inside  this 


range,  (4.17)  gives  the  required  result  with  the  constant  including  a factor  Hull  _ l|vlL  . 


From  this  bound  and  the  similar  one  for  the  operation  of  differentiation  one  can  state 
the  following 

Theorem  4.1.  Suppose  Lu  has  the  form  ^...L^u,  where  each  consists  either  of 


differentiation  3^  or  multiplication  by  u or  a C function  of  x . let 


“h  '"'Si’  each  = rL  p:  that  is,  if  L v = 3^v  then  = rS^prv  and 


if  L^v  = uv  or  f(x)v  then  = r(pru) (prv)  o£  r(prf) (prv) ; p and  r are  prolonga- 


tions and  restrictions  for  splines  of  order  y > 2.  Then 


rLu  - = 0(h  ) as  h -► 


(4.19) 


The  proof  follows  immediately  from  the  lemma  and  the  earlier  bound  for  differentiation 
after  the  decomposition 

rq-1. 


rL  - Lj^r  = (rb'^-L^f)  l'^'^  • -L'  + 


,q  ,q-s+l,~,s  ,S',,s-l  1 

•••  + (rL  -L|^r)L  -"L  + ••• 


+ L^*  •Lj;j(rL‘-L;r) 


,2,-,1  ,1-> 

Lhr' 


(4.20) 


Thus  the  full  order  of  accuracy  is  preserved  no  matter  how  many  derivatives  and  products  are 


taken.  In  particular,  for  the  linear  splines  p = 2,  the  usual  loss  of  two  orders  of 


accuracy  as  one  goes  from  3^  to  3^  is  avoided  by  carrying  out  a projection  between  the 


two  derivatives:  the  resulting  scheme  is,  of  course,  much  less  compact  than  the  standard 
scheme  and  the  fourth  order  accuracy  in  this  case  will  normally  be  better  preserved  by 
'half-lumping'  the  mass  matrix.  In  section  7,  however,  we  show  how  the  intermediate  pro- 


jection is  valuable  in  the  advection  operation  u 3 u. 

X 
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One  final  remark  on  these  methods  - the  L norm  clearly  plays  an  important  part  in 
the  thinking  and  3^U  is  constructed  as  a best  approximation  in  that  norm.  However,  when 
the  final  set  of  nodal  parameters  u||  is  obtained,  an  optimal  prolongation  could  be  sought 
with  respect  to  another  space.  For  example,  suppose  that  with  linear  splines  a uj^  was 
obtained  very  close  to  r^u”.  Then  the  best  estimate  of  any  functional  of  u*^  from  this 
data  would  depend  on  the  smoothness  assumed  in  u*^.  For  instance,  nodal  values  might  well 
be  smoothed  out:  this  would  be  consistent  with  the  observation  from  (3.5)  that 


I (r  u)^/u^l  > 1 for  all  Fourier  modes  with  0 < kh  ^ n.  (Compare  the  viewpoint  in  the  field 
of  optimal  recovery  - see  Golumb  and  Weinberger  [13],  Meinguet  [18],  Micchelli  and  Rivlin 
[19],  and  references  therein.) 
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5.  INTERPRETATION  OF  STANDARD  DIFFERENCE  METHODS 

In  a sense  a finite  difference  scheme  is  not  concerned  with  prolonqation  operators  and 
remains  entirely  in  the  discrete  space.  However,  the  standard  stability  theory  always 
supposes  some  embedding  of  the  approximants  into  the  solution  space  of  the  differential  pro- 
blem and  Raviart  [22]  does  this  in  a manner  which  is  directly  antecedent  to  Aubin's  analysis, 
.'toreover,  the  choice  of  the  initial  data  implies  the  use  of  a restriction  operator  and  the 
interpretation  of  the  final  results  implies  that  of  a prolonqation  operator. 

The  simplest  and  most  convenient  operators  from  a theoretical  viewpoint  are  those  used 
by  Raviart:  the  spatial  region  is  sub-divided  into  rectangular  cells  and  r*^u  is  de- 

fined as  the  vector  Q = (O^)  obtained  from  averaging  u over  each  cell:  the  corresponding 


0 T* 

prolongation  p 0 = ^ 0.9. (x),  where  9.  i 

(j)  ^ ^ 


s the  characteristic  function  of  the  cell  C.. 


Then  p*^  and  r^  are  mutually  optimal  in  : see  Temam  131]  for  developments  in  other 
spaces.  This  viewpoint  is  often  used  in  a loose  way  in  direct  modelling  of  conservation  laws, 
especially  in  fluid  dynamics.  However,  in  constructing  such  difference  schemes  these  cell 
averages  have  to  be  inter-related  or  related  to  values  at  intermediate  points  and  it  is  then 
that  inconsistencies  in  the  approach  are  often  revealed,  as  compared  with  Temam's  rigorous 
development . 

It  is  much  coinnoner  in  practice  to  regard  the  numbers  in  a difference  calculation  as 
representing  grid-point  values  of  the  unknowns  - and  this  is  usually  how  the  initial  values 
are  chosen:  that  is,  the  restriction  r is  a collocation.  It  is  interestinq  then  to  con- 

sider  what  the  optimal  prolongation  p should  be  to  recover  maximum  information  at  each  time 

c 2 

step.  From  (2.5)  p must  be  interpolatory . Also,  Aubin's  results  do  not  apply  in  L 

because  r'^  is  not  defined  as  a continuous  operator  there:  we  need  to  work  in  an  under- 
lying Sobolev  space  where,  by  Sobolev's  lemma,  p > d/2.  Because  p'"  has  to  inter- 

2 

polate,  the  norm  in  (2.5)  under  which  the  infimum  is  taken  could  in  fact  be  just  the  L 
norm  of  the  p^*^  derivatives.  The  solution  to  this  problem  is  then  well-known  in  the  case 
of  a finite  interval  of  the  real  line  to  be  given  by  natural  spline  interpolation  (see 


de  Boor  (5),  Schoenberg  (241)  and  more  generally  will  involve  generalised  splines  (see,  e.g. 


Schultz  and  Varga  [25]). 
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It  is  not  being  suggested  here  that  such  procedures  should  be  used  in  practice  but 
that  these  relationships  should  be  borne  in  mind  when,  for  instance,  designing  contour- 
plotting packages:  when  processing  the  output  of  finite  difference  procedures  one  would 
expect  them  to  be  interpolatory ; but,  as  noted  above,  when  processing  that  from  finite  ele- 
ment schemes  we  could  expect  some  smoothing  of  nodal  values. 

Let  us  consider  now  very  briefly  how  three  of  the  most  important  classes  of  difference 
scheme  fit  into  the  present  framework.  The  Crank-Nicolson,  or  more  generally  the  e-method, 
applied  to  3^u  = Lu  leads  to  an  evolution  operator 


where  is  a central  difference  approximation  to  L and  some  mild  restriction  on  At 


may  be  needed  to  solve  for  u. 


n+1 


when 


is  non-linear. 


Then 


[Ij^-ateLj^]  (At)"^[r‘^E(At)-Ej^r‘^]u'’ 


= (At)  [Ij^-At9Lj^]r‘^u"*^-[Ij^+At(l-e)L^lr'^u'''} 


(5.2) 


is  the  usual  expression  for  the  truncation  error  in  a finite  difference  analysis.  Similarly 
a Lax-Wendroff  scheme  may  use  two  difference  approximations  together  with  an 

averaging  operator  A to  give 


I + [A  + I 


(5.3) 


and  more  general  Runge-Kutta  schemes  may  be  expressed  in  the  same  way,  all  giving  a defini- 
tion of  truncation  error  identical  with  the  usual  one. 

Multi-level  schemes,  like  the  leap-frog,  require  a little  more  care.  We  define  V, 

h 

to  represent  discrete  approximations  at  two  time  levels  x At  apart,  with  r defined  bv 

2 h 

the  procedure  used  to  obtain  approximations  at  t=0,  t=  i At  from  the  initial  data.  The 
two  meshes  may  or  may  not  be  staggered  and  we  may  cover  the  generalised  case  in  which  the 
two  time  levels  are  updated  differently  by  using  two  difference  operators  • 

Denoting  by  components  of  u^  at  the  two  levels,  we  have 
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t<2) 


/ (1)  \ 
h \ 


2 (1)  (2) 

If  L is  linear  and  -n  is  an  eigenvalue  of  L.  L,  , then  the  eigenvalues  of  E.  are 

h h h 

I 

eig.  = 1 jf  ipAt  (l  - (j  uAt)^]  - j , (5.6) 

the  upper  sign  giving  the  approximating  modes  and  the  lower  the  familiar  spurious  modes, 

when  all  variables  are  held  at  each  level.  The  error  (At)  [r^u'^^^-E,  r,  u"^]  is  then  best 

h h h 

studied  by  resolving  r^^u  into  the  two  sets  of  modes:  for  example,  when  u is  scalar  and 
r(l)  r(2) 

U = L , one  gets 


<2)  (1)  r,  A , 1 . 

% = "h  = i ■ *2  ^ 2 


that  is,  the  true  mode  represents  u^^'  half  a time-step  ahead  of  u,^^'  and  the  spurious 

n h 

(1)  (2) 

mode  having  u and  u almost  equal  and  opposite  in  sign.  As  is  well  known,  such  a 
h h 

linearised  analysis  carried  out  for  Lu  = ~ shows  the  spurious  mode  to  be  not  only 

travelling  in  the  wrong  direction  but  to  be  amplified  when  the  true  mode  is  damped  and  thus 

capable  of  generating  a non-linear  instability. 

Each  of  these  sch€  -s  may  be  "hybridised”  by  using,  for  instance,  Galerkin  methods  to 

generate  L^.  The  wave  equation,  treated  as  a system  by  leap-frog,  is  a useful  example. 

Unstaggered  linear  elements  under  the  Galerkin  procedure  give,  of  course,  the  asymptotic 
4 

error  (1/180)C  of  (3,8):  if  the  two  meshes  are  staggered  this  is  reduced  by  a factor 


- 13/32  but  at  a cost  of  reduced  stability. 
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6.  A PETROV-GALERKIN  METHOD  FOR  3 u = a3  u 
t x_ 

2 

If  u IS  qiven,  the  Galerkin  approximation  to  Lu  is  optimal  in  the  L sense  so 
that,  in  a semi-discrete  method  or  an  explicit  difference  method  in  time  as  At  -►  0,  the 
Galerkin  equations  aive  an  optimal  approximation  to  the  time  derivative  or  difference  of  u. 
But  for  an  implicit  difference  method  one  has  to  approximate  the  solution  to  equations  like 
u''^^-9AtLu''^^  = q'^.  If  L is  not  self-adjoint  it  is  likely  that  the  Galerkin  equations 
will  give  a poor  approximation  to  Thus  experiments  for  similar  equilibrium  problems 

are  being  conducted  to  find  appropriate  test  functions  for  a Petrov-Galerkin  approach  - see 
12]  and  [15]  and  references  therein. 

Suppose  test  functions  are  used  to  give  an  approximation  to  3^u  = Lu  by 


- eitLu"'^^  - [U^+d-e)  AtLu")  ,X^>  = O . (6.1) 

Then  the  operator  (I  - SAtL)  in  the  inner  product  defines  a new  restriction  operator  s^^ 
for  obtaining  The  evolution  operator  in  (2.8)  is  given  by 

E,  = s^[l  + {l-0)AtLlp^  , (6.2) 

h h h 

while  the  r.  occuring  in  the  truncation  error  of  (2.23)  will  normally  be  different.  Even 
h 

for  the  explicit  case  0=0,  it  may  be  useful  to  use  the  Petrov-Galerkin  method. 

We  consider  here  the  case  when  Euler's  method  is  to  be  used  on  u = a3  u with  linear 

t X 

elements  on  a uniform  mesh  and  p^ , r^  defined  as  in  section  3.  Now  it  is  well-known  that 

2 

Galerkin's  method  gives  a scheme  which  is  unstable  unless  At  = 0(Ax)  . On  the  other  hand, 
when  a is  constant  the  characteristics  can  be  followed  exactly  and  there  need  be  no 
evolutionary  error.  So  we  consider  whether  there  is  a choice  of  which  will  achieve 

this  result. 

Lemma  6.1.  Suppose  the  basis  functions  <ti^  are  such  that  we  can  set 


3^i]idx)  = iT^_^(x)  - TjCx) 


4'.  (x)  = ♦ . (x)  - ■ . (x) 

1 3 3 


(6.3) 


and  "^(x)  = i'(x/h-j)  for  some  n(x).  Then  the  Euler-Petrov-Galerkin  method  will  follow 
the  characteristics  of  3^u  = a3  u for  aAt  = h iff. 

t X 


( Xi  . 't’j>  = 0 i.  j 


(6.4) 
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Then  if  , we  need  U*^  ^ ^ aAt  = h-  Thus  the  following 

should  be  an  identity  in  or,  equivalently,  in  Pj+j  “ Qj 

< “<2j+l'"j’*j  ■ ' Xi>  = 0 . Vi  . 

The  coefficient  of  is  just  1(1^  (x)  from  (6.3)  so  the  result  follows  immediately. 

For  linear  basic  functions  4’^  > <*  4’^  = (4?  ,~4?)/h,  where  4*?  is  the  characteristic 

3 X J 3-1  3 3 

function  of  the  interval  (jh,(j+l)h],  so  that  4^  = ’^'^'4?.  Moreover,  4^  is  itself  the 
difference  of  two  triangular-shaped  functions,  4^  = s^  l~^j  ”^®x'e  s . (x)  = x/h-j  in 
[jh,(j+l)hl  and  s^(x)  = 0 otherwise.  Thus  we  need  < , s3  to  be  constant,  indepen- 
dent of  j.  Clearly  if  X is  to  be  of  the  form  X (x)  = X(x/h-m)  with  X of  compact 

mm 


support,  we  need 


We  choose 


/ tx(t)dt  = 0 , V 

•'  m 


X(t)  = 4-6t,  for  t c 10,1]:  X(t)  = 0,  otherwise  . (6.6) 

Then  /xdt  = 1 and  for  constant  a > 0 we  ta)te  test  functions  which  are  a linear  combina- 
tion of  X.  and  4!".  We  must  first  establish  stability. 

11 

Theorem  6.1.  The  Petrov-Galerjtin  scheme  given  by 

< (u"  + aAt3  u")  , (l-v)4^  + vX.)  = 0 , V i , (6.7) 

X 11 

is  stable  for  constant  a on  a uniform  mesh  if 

0 < aAt/h  < V <■  1 . (6.8) 


We  shall  need  the  following  results  which  can  be  easily  computed. 


Lemma  6.2.  For  the  basic  functions 


i . , * . and  x . we  have : 
3 3 3 


0 1 0 0,.  1 
( X.  ,4j>  = < X.  ,4j>  = < 4i-4j>  = h(5.^  = 4<X^.Xj>  . 


Denote  by  4^  the  test  functions  in  (6.7),  defining  a restriction  operator  s^^.  Then 

we  form  first  V = p ^s,  (a3  u'^)  , multiply  equation  (6.7)  by  the  coefficient  (U?^^+u'!'+AtV. ) 

hx  1-/1  111 

and  sum  over  i to  get 
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0 


(6.10) 


(o"+aAt3u"),  I, . , u"+AtV. ) $.)  = 

X (l)  1 111 

0 ^ 

I,et  us  denote  by  U the  vector  of  nodal  values  (U^)  and  by  p U,  p U,  p^U  its  pro- 
logations  using  basis  functions  {X^}  respectively:  we  also  retain  the  nota- 

tion U for  P^U-  Then  (6.10)  can  be  written  as 

< (u"+aAt3^u")  , l(l-v)p^+vp^]  u"+  AtV)>  = 0 


(6.11) 


and,  from  the  lemma,  the  cross  products  between  and  are  symmetric  and  therefore 

cancel  to  give  a difference  of  squares.  Furthermore,  from  the  construction  of  V it  follows 
that  for  any  vector  W, 

<V-a3^u",  p'^’w)  = 0 (6.12) 

So  (6.11)  reduces  to 

(d-v)  + vl|p°u"'^^ll^]  - I(i-v)  |lu"ll^+ vIlpVll^) 


= At(a3^u",  I(l-u)p^+vp^]  (2u''+AtV)  > . 

To  simplify  the  right-hand  side,  note  that  that 


(6.13) 


l|h3ytpV||'=  .. 

expanding  the  sum  gives  therefore 

2<3^o",  p\")  = - h||3^u"|)^  . (6.14) 

For  the  final  term,  we  have  from  (6.12) 

(a3^u”,  p%)  = (pV  P*V>  = (1-v)  ||p^v||^+  v||p°v||^  , 

while  from  the  Cauchy-Schwarz  inequality 

2<a3y,  p*V>  < (l-v)||p^ll2+  vIIp’^vII^  llaayil^  , 
so  that  together  we  have 

(a3^u",  p'’’v)<  ||a3^u"|(^  . (6.15) 

Putting  (6.14)  with  (6.15)  gives  for  the  right-hand  side  of  (6.13), 
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1 


((aAt)^  - yhaAt)  ||3^u"||^ 

so  that  stability  follows  for  uh  ^ aAt  . 

Next,  we  can  consider  accuracy  and  have  the  following: 

Theorem  6.2.  The  Petrov-Galerkin  scheme  given  by  (6.7)  is  second  order  accurate  if 
u = ait/h  = constant  < 1 . 


We  can  use  Fourier  analysis,  putting  u = ue  . Then  writing  = aAt/h  and  with 


a(C)  given  by  (3.5) , 


(r^E(At)u)^  = ua(C)e^^^^'^''’ 


The  mass  matrix  for  (6.7)  has  transform  M = h [ j (1-u) (2+cos  C)  + v] , and  we  have 
(Ej^r^u)^  = ua(C)e^^^e(S) 

where  8(5)  = 1 + aAtw”^ I (1-v) i sin  C + v(e^^-l))  . ( 

The  essential  factor  in  the  truncation  error  is  therefore  given  by 

(At)'^re^^^'-6(5)]  -v  (At)”^[i  p(v-u)5^+  I iu(v-u^)t^+...]  ( 

as  ? -*■  0 

The  scheme  is  thus  first  order  for  v > p,  second  order  for  p = v < 1 and  exact  for 


It  may  be  noted  that  this  scheme  with  v = p closely  resembles  the  Lax-V'Jendrof f 

a 

method  for  this  equation  (or  any  other  method)^  In  fact,  the  terms  in  iP  *n  (6.7) 

p 

are  exactly  the  same  and  it  is  only  the  presence  of  the  mass  matrix  which  distinguish  the 
schemes:  it  has  the  effect  of  marginally  improving  the  accuracy,  for  the  coefficient  of 

3 2 2 

iC  in  (6.18)  is  u (l-p)/6,  while  for  Lax-Wendroff  it  is  ud-p  ) /3  . 


F 


7. 


APPROXIMATION  OF  THE  ADVECTION  OPERATOR 


In  the  foregoinq  sections,  we  have  identified  three  easily  implemented  approximations 
to  us  u,  all  based  on  the  Galerkin  method  with  linear  elements  but  distinguished  by  the 

X 

way  in  which  the  product  is  formed.  The  definition  of  the  methods  is  contained  in  the  ex- 
pression for  the  truncation  errors,  in  a semi-discrete  process  and  with  restriction  r^^,  as 
follows:  the  single-stage  Galerkin  method  (SSG)  gives 

r,  (u3  u)  - r^[(p^r  u)(a  p^r  u)l  ; (7.1) 

fl  X j)  X jl 

the  double-stage  Galerkin  method  (DSG)  gives 

r.  (u3  u)  - r^  Up^r,  u)  (p^r^a  •’  (7.2) 

h X h X h 

and  point  multiplication  Galerkin  (PMG)  gives 

r.  (ua  u)  - (r.  u)  (r^a  p^r.u)  . (7.3) 

h X h X h 

1.  c 

The  natural  choice  for  r,  in  the  first  two  cases  is  r , while  r is  probably  most 

h 

natural  for  the  third.  For  the  purpose  of  comparison  then,  we  tabulate  for  each  case  under 
both  choices  the  leading  term  in  e(5,£)  for  a single  mode  as  introduced  for  (4.16).  The 
entry  for  PMG  under  r follows  itmediately 


Restriction 

C 

r 

SSG 

IB 

DSG 

PMG 

mm 

Table  Leading  terms  in  €(6,0  for  the  three  methods 

1 c 

SSG,  DSG,  PMG  under  the  restrictions  r and  r . 

Q 

from  (3.8)  because  point  multiplication  is  exact  under  r and  there  is  no  interference 

c c c 1 

between  the  processes  because  r (u8^u)  s (r  u) (r  3^u) . The  DSG  entry  under  r is  new 
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and  is  also  the  smallest:  comparing  with  (3.13),  it  results  from  a which  behaves 

like  1 + (24^5’  + 3£^C'^+  25£  ' ^-4t ''‘)/720. 

For  smooth  functions  and  under  the  restriction  r^,  we  would  expect  therefore  that 
DSG  would  be  up  to  nearly  six  times  better  than  SSG.  This  will  depend  on  the  spread  of 
inodes  in  u since  the  total  truncation  error  at  node;  is  of  the  form 

T.E.  = /dke^’‘^\(kh)  /d<  j(k-K)  u(ik +jic)Ii(ik- j<)  e(e+6,  9-6)  , (7.4) 

where  we  have  put  £+£'  = 20  = kh  and  C-£'  = 26  = ich.  Thus  we  compare 

e^^^(9+6,  0-6)  'V  (170‘*-36  0^5+100^  ^+  406^+  56‘*)/720  (7.5) 

SSG 


and 


c„^^(0+6,  0-6)  -v  - (39'‘+160^6  - 300^6^+1606^-55'*) /720  : 
DSG 


(7.6) 


— y)c 

for  a normally  distributed  spectrum,  u(k)<x  e , these  expressions  would  be  multiplied 
2 2 

”Y(k  )/  2 

by  e before  being  integrated  over  K:(or  6),  so  that  the  leading  terms  in  (7.5) 

and  (7.6)  would  be  dominant. 

By  contrast,  under  these  conditions  PMG  would  be  quite  uncompetitive.  On  the  other 


hand,  considering  the  error  r u-u,  under  the  restriction  r in  this  case,  we  need  to 

h 

compare  (7.5)  and  (7,6)  with 


e-fi)  4(9‘*-49^6+69^6^-466^+6‘*)/720 

PMG 


(7.7) 


which  shows  it  to  be  quite  comparable  to  DSG.  The  choice  between  these  methods  must  there- 

Q 

fore  come  down  to  a choice  of  restriction.  With  r , the  factor  a(kh)  in  (7.4)  is  miss- 
ing and  this  becomes  significant  when  contributions  from  kh+2n'T  for  all  n are  combined 
in  the  coefficient  to  Thus  suppose  we  write  in  each  case 

E(k)  = maxUI  . /dK|i(k-<)  u(|k+ jtc)  G(ik- iol  . (7.8) 

Then  the  coefficient  of  in  the  truncation  error  is  )DOunded  by 

, a(kh+2nii)E(k+2n’i/h)  : (7.9) 

(n) 
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for  PMG  under  r*^  this  is  an  undamped  sum  over  the  E values;  but  for  DSG  (or  SSG)  under 

r^,  a(C)  is  rapidly  d2imped  for  |c|  > it,  being  reduced  by  a factor  nine  by  C = 3ii/2 

-2 

and  behaving  asymptotically  like  ^ . Such  modes  beyond  the  resolution  of  the  grid  are 

continuously  created  by  the  product  operation  and  result  in  the  phenomenon  of  aliasing:  a 
sharply  damped  a is  essential  for  their  suppression.  For  the  spl ine-Galerkin  methods  of 
section  4,  a(5)  '''  S and  for  spectral  methods  (Orszag  [21])  a(C)  = 1 for  lc|  ^ ” and 

a(C)  = 0 for  U1  > It. 

One  further  factor,  besides  the  behaviour  of  errors  for  small  E and  the  suppression 

of  aliased  modes  for  large  5,  needs  to  be  considered  in  comparing  methods.  That  is  the 

growth  rate  of  errors  as  expressed  by  the  second  term  on  the  left  of  (2.17)  or  the  stability 

bound  in  (2.22).  For  SSG  one  will  have  the  same  energy  conservation  properties  as  for  the 

differential  problem  emd  the  error  growth  can  also  be  analysed  in  a similar  way:  for  example, 

in  the  model  problem  3 u + u3  u = 0,  one  has  for  two  semi-discrete  approximations  U and 
t X 

V on  the  whole  real  line 

^||u-v||^=  - . U-V>  = <U^-V^  , 3^(U-V)> 

= j ( 0+V,3^(U-V)^>  = - i<3^(U+V)  , (U-V)^>  . (7.10) 

Thus  the  deviation  grows  only  when  the  mean  solution  is  "compressive”.  On  the  other  hand, 
in  obtaining  the  greater  accuracy  with  DSG  one  has  sacrificed  energy  conservation  and  this 
result  on  error  growth.  For  energy  conservation  one  has  now,  where  are  the  forward 

and  central  difference  operators, 

(3^.U  + U(p^r^3^U)  ,U>  = 0 , 

i.e.,  ^ I ||u|l^  = (d-p^r^)  3^u,u^>  , (7.11) 

~ (1/72) (A^U^)^(6^A^U^)  : (7.12) 

4 

thus  energy  grows  in  a typical  compressive  wave-front  at  a rate  which  is  0(h).  Similarly 
instead  of  (7.10)  one  has 
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^l|u-vl|^+  |o^(U+V)  ,(U-V)^>  = 2<U(l-p^r^)  3^U-V(l-p^r^)8^V,  U-V) 

-v  (1/36)  [(A^U.)  («^A^U.)-{ft^V^)  (6^a^V,)]t^(U^-VJ  , (7.13) 

which  is  much  less  useful.  However,  in  numerical  comparisons  with  the  shallow  water  equa- 
tions on  a two  dimensional  triangular  grid,  Cullen  [3,4]  has  found  DSG  to  be  significantly 
more  accurate  th„.i  SSG  and  both  much  better  than  PMG:  he  also  noted  that  the  two  simpler 
Galerlcin  processes  in  DSG  could  be  inmplemented  more  efficiently  than  the  one  process  in 


Finally,  let  us  briefly  consider  the  effect  of  an  irregular  mesh.  With  linear  ele- 
ments, the  Galerlcin  method  gives  for  V = r^B^p^U  the  equations 

(|v.,+  iv.)h.  +(iv.+  |.v.'h.  =^(U.-U.,)  , (7.1 

6 3-1  3 3 3“  3 ] 6 ]+l  1+  2 3+I  j-l 

where  h.  and  h.,  are  the  intervals  to  the  left  and  right  of  the  node.  Now 

3-3+ 

suppose  that  a co-ordinate  transformation  were  made  so  as  to  give  a regular  mesh  in  a new 


co-ordinate  y and  denote  dx/dy  by  g(y) . Then  if  g(y)  is  piecewise  constant,  p 
denotes  the  same  prolongation  in  both  x-  and  y-  space  and  the  same  vector  V is  gen- 
erated from 

<♦1  , gp^V)  = (ij,|  , 3yP^U>  . (7.15) 

That  is,  if  we  define  r^  to  lae  the  restriction  carried  out  in  y-space  with  the  weight 
function  g(y)  in  a least  squares  procedure,  V = t^(g  ^S^P^U) . Now  we  can  carry  out  a 
Fourier  analysis  in  y-space  and  with  u = ue^*^^  consider  the  truncation  error 

r.  (g  ^(y)3  )u  - (r  g ^(y)3  P^)r  u . (7.16) 

n y y Y ^ 

This  is  probaibly  simplest  when  r,  is  chosen  to  be  r : then  the  ratio  of  the  nodal  values 

h y 

-1  --1 

is  C sin  ^(M  j » where  C = kh  and  h is  the  y-mesh  interval;  the  tridiagonal  matrix 
M and  vector  a are  given  by 

« - . (7.1V) 
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